Semi-supervised Learning for Multi-target Regression

Authors

  • Jurica Levatic
  • Michelangelo Ceci
  • Dragi Kocev
  • Saso Dzeroski
Abstract

The most common machine learning approach is supervised learning, which uses labeled data to build predictive models. However, in many practical problems, annotated data are scarce because annotation is expensive, tedious and time-consuming. At the same time, unlabeled data are often readily available in large amounts. This is especially pronounced for predictive modelling problems with a structured output space. Semi-supervised learning (SSL) aims to use unlabeled data as an additional source of information in order to build better predictive models than can be learned from labeled data alone. The majority of work in SSL considers the simple tasks of classification and regression, where the output space consists of a single variable. Much less work has been done on SSL for structured output prediction. In this study, we address the task of multi-target regression (MTR), a type of structured output prediction where the output space consists of multiple numerical values. Our main objective is to investigate whether we can improve over supervised methods for MTR by using unlabeled data. We use ensembles of predictive clustering trees in a self-training fashion: the most reliable predictions on unlabeled data are iteratively used to re-train the model. We use the variance of the predictions of the ensemble's models as an indicator of the reliability of predictions. Our results provide a proof of concept: unlabeled data improve the predictive performance of ensembles for multi-target regression; however, further work is needed to automatically select the optimal threshold for the reliability of predictions.
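The self-training loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the base learner is a 1-nearest-neighbour regressor on bootstrap samples (swapped in for predictive clustering trees to keep the example dependency-free), and the function names, the fixed `threshold`, and the toy data are all assumptions made for this sketch.

```python
import random
from statistics import mean, pvariance


def nearest_target(sample, x):
    """Base learner (1-NN, an assumption): targets of the closest example in `sample`."""
    closest = min(sample, key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], x)))
    return closest[1]


def fit_ensemble(labeled, n_members, rng):
    """Bagging: every member is one bootstrap sample of the labeled examples."""
    return [[rng.choice(labeled) for _ in labeled] for _ in range(n_members)]


def ensemble_predict(members, x):
    """Return (per-target mean prediction, member variance averaged over targets)."""
    votes = [nearest_target(m, x) for m in members]
    per_target = list(zip(*votes))  # one tuple of member votes per target
    prediction = [mean(t) for t in per_target]
    spread = mean(pvariance(t) for t in per_target)
    return prediction, spread


def self_train(labeled, unlabeled, threshold, n_members=25, max_iter=10, seed=0):
    """Move unlabeled points whose ensemble variance is <= threshold into the
    training set, using the ensemble's own predictions as their targets."""
    rng = random.Random(seed)
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_iter):
        if not pool:
            break
        members = fit_ensemble(labeled, n_members, rng)
        scored = [(x, *ensemble_predict(members, x)) for x in pool]
        reliable = [(x, y) for x, y, var in scored if var <= threshold]
        if not reliable:
            break  # no prediction is reliable enough: stop early
        labeled.extend(reliable)
        pool = [x for x, _, var in scored if var > threshold]
    return fit_ensemble(labeled, n_members, rng), labeled, pool


# Toy data (hypothetical): two targets driven by one feature, y = (2x, -x).
labeled = [((x / 4,), [2 * x / 4, -x / 4]) for x in range(5)]
unlabeled = [(x / 20,) for x in range(21)]
members, train, left = self_train(labeled, unlabeled, threshold=0.05)
prediction, spread = ensemble_predict(members, (0.5,))
```

The `threshold` argument is the quantity the abstract flags as hard to choose: a very large value accepts every pseudo-label in the first iteration, while a very small one accepts none, in which case the result is the purely supervised ensemble.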


Similar resources

Semi-Supervised Multi-Task Regression

Labeled data are needed for many machine learning applications but the amount available in some applications is scarce. Semi-supervised learning and multi-task learning are two of the approaches that have been proposed to alleviate this problem. In this paper, we seek to integrate these two approaches for regression applications. We first propose a new supervised multi-task regression method ca...


Semi-supervised Regression with Order Preferences

Following a discussion on the general form of regularization for semi-supervised learning, we propose a semi-supervised regression algorithm. It is based on the assumption that we have certain order preferences on unlabeled data (e.g., point x1 has a larger target value than x2). Semi-supervised learning consists of enforcing the order preferences as regularization in a risk minimization framew...


Active + Semi-supervised Learning = Robust Multi-View Learning

In a multi-view problem, the features of the domain can be partitioned into disjoint subsets (views) that are sufficient to learn the target concept. Semi-supervised, multi-view algorithms, which reduce the amount of labeled data required for learning, rely on the assumptions that the views are compatible and uncorrelated (i.e., every example is identically labeled by the target concepts in eac...


Semi-supervised learning by search of optimal target vector

We introduce a semi-supervised learning estimator which tends to the first kernel principal component as the number of labeled points vanishes. We show an application of the proposed method to dimensionality reduction and develop a semi-supervised regression and classification algorithm for transductive inference.


Target Localization in Wireless Sensor Networks Using Online Semi-Supervised Support Vector Regression

Machine learning has been successfully used for target localization in wireless sensor networks (WSNs) due to its accurate and robust estimation against highly nonlinear and noisy sensor measurement. For efficient and adaptive learning, this paper introduces online semi-supervised support vector regression (OSS-SVR). The first advantage of the proposed algorithm is that, based on semi-supervise...




Publication date: 2014